Addressing Malicious Noise in Clickthrough Data
Abstract
Clickthrough logs are an increasingly used source of training data for learning ranking functions. Because position in search results has a large impact on commercial websites, malicious noise is bound to appear in search engine click logs. We present preliminary work on addressing this form of noise, which we term click-spam. We analyze click-spam from a utility standpoint and investigate whether personalizing web search results by partitioning the user population can reduce or eliminate the financial incentives for potential spammers. We formalize click-spam, analyze the incentives for malicious agents, and then investigate the model with some examples.
Similar resources
Improving Relevance Prediction by Addressing Biases and Sparsity in Web Search Click Data
In this paper, we present our approach and findings from participating in the 2012 Yandex Relevance Prediction Challenge. Our approach has two goals: on one hand, we aim to address four types of biases, namely, position-bias, perception-bias, query-bias, and session-bias to better interpret the clickthrough information; on the other hand, we aim to address the clickthrough sparsity by exploiting var...
CWI at the Photo Retrieval Task of ImageCLEF 2009
CWI’s experiments investigate the usefulness of clickthrough data for improving the diversity of image retrieval results. We use the search logs provided to us by Belga to find relevant images; we consider that these correspond to images clicked for queries exactly matching or best matching a topic’s title and cluster titles. To reduce the noise, we also filter these results and only consider t...
Query Session Data vs. Clickthrough Data as Query Suggestion Resources
Query suggestion has become one of the most fundamental features of Web search engines. Some query suggestion algorithms utilize query session data, while others utilize clickthrough data. The objective of this study is to examine which of these two resources can provide more effective query suggestions. Our results show that query session data outperforms clickthrough data in terms of clickthr...
Learning Phrase-Based Spelling Error Models from Clickthrough Data
This paper explores the use of clickthrough data for query spelling correction. First, large amounts of query-correction pairs are derived by analyzing users' query reformulation behavior encoded in the clickthrough data. Then, a phrase-based error model that accounts for the transformation probability between multi-term phrases is trained and integrated into a query speller system. Experiments...
Spying Out Accurate User Preferences for Search Engine Adaptation
Most existing search engines employ static ranking algorithms that do not adapt to the specific needs of users. Recently, some researchers have studied the use of clickthrough data to adapt a search engine’s ranking function. Clickthrough data indicate for each query the results that are clicked by users. As a kind of implicit relevance feedback information, clickthrough data can easily be coll...